TAGARAB: A Fast, Accurate Arabic Name Recognizer Using High-Precision Morphological Analysis

Authors

  • John Maloney
  • Michael Niv
Abstract

We describe a fast, high-performance name recognizer for Arabic texts. It combines a pattern-matching engine and supporting data with a morphological analysis component. The role of the morphological analysis in accurate name recognition is discussed. We also provide evaluations of both morphological analysis and name recognition.

Similar resources

Extracting Named Entities Using Named Entity Recognizer and Generating Topics Using Latent Dirichlet Allocation Algorithm for Arabic News Articles

This paper explains how to extract named entities and topics from Arabic news articles. Due to the lack of high-quality tools for Named Entity Recognition (NER) and topic identification for Arabic, we have built an Arabic NER (RenA) and an Arabic topic extraction tool using the popular LDA algorithm (ALDA). NER involves extracting information and identifying types, such as nam...

Exploiting Multilingual Wikipedia to Improve Arabic Named Entity Resources

This paper focuses on the creation of Arabic named entity gazetteers, by exploiting Wikipedia and using the Naïve Bayes classifier to classify the named entities into the three main categories: person, location, and organization. The process of building the gazetteer starts with automatically creating the datasets. The dataset for the training is constructed using only Arabic text, whereas, the...

Simple, Fast and Reliable Liquid Chromatographic and Spectrophotometric Methods for the Determination of Theophylline in Urine, Saliva and Plasma Samples

In this study, a high-performance liquid chromatographic method (HPLC) and UV spectrophotometric method were developed, validated and applied for the determination of theophylline in biological fluids. Liquid-liquid extraction is performed for isolation of the drug and elimination of plasma and saliva interferences. Urine samples were applied without any extraction. The chromatographic separat...

Full Automatic Arabic Text Tagging System

Part-of-Speech tagging is the process of assigning grammatical part-of-speech tags to words based on their context. Many automated tagging systems have been developed for English and many other western languages, and for some Asian languages, and have achieved accuracy rates ranging from 95% to 98%. A tagged corpus has more useful information than an untagged corpus, so a tagged corpus can be used ...

Publication date: 1998